26 research outputs found

    A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation

    We introduce a simple, linear SNR estimator (strictly speaking, an estimator of the periodic-to-random power ratio) that covers 0 dB to 80 dB without additional calibration or linearization and provides reliable descriptions of aperiodicity in speech corpora. The main idea is to estimate the background random noise level without directly extracting the background noise. The method is applicable to a wide variety of time windowing functions with very low sidelobe levels. The estimate combines the frequency derivative and the time-frequency derivative of the mapping from filter center frequency to output instantaneous frequency. This procedure can replace the periodicity detection and aperiodicity estimation subsystems of the recently introduced open-source YANG vocoder. Source code of a MATLAB implementation of this method will also be open-sourced. Comment: 8 pages, 9 figures, submitted to and accepted at Interspeech201

    An objective test tool for pitch extractors' response attributes

    We propose an objective method for measuring the responses of pitch extractors to frequency-modulated signals. It lets us evaluate different pitch extractors with unified criteria. The method uses extended time-stretched pulses combined by binary orthogonal sequences. It simultaneously measures the linear and the non-linear time-invariant responses and the random and time-varying responses. We tested representative pitch extractors using fundamental frequencies spanning 80 Hz to 400 Hz in 1/48-octave steps and produced more than 1000 modulation frequency response plots. We found that animating these plots as a scientific visualization lets us grasp the behavior of different pitch extractors at a glance. Such efficient and effortless inspection is impossible when examining each plot individually. The proposed measurement method, together with the visualization, led to a further improvement in the performance of one of the tested extractors. In other words, our procedure turns that pitch extractor into the most reliable measuring equipment, which is crucial for scientific research. We open-sourced the MATLAB code of the proposed objective measurement method and visualization procedure. Comment: 5 pages, 9 figures, submitted to Interspeech2022. arXiv admin note: text overlap with arXiv:2111.0362
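The test grid described in this abstract (80 Hz to 400 Hz in 1/48-octave steps) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name `f0_grid` and the choice to start exactly at 80 Hz and stop at the last step not exceeding 400 Hz are assumptions.

```python
import numpy as np

def f0_grid(f_low=80.0, f_high=400.0, steps_per_octave=48):
    """Hypothetical helper: log-spaced fundamental frequencies from f_low
    up to the last 1/steps_per_octave-octave step not exceeding f_high."""
    # Number of whole steps that fit between f_low and f_high.
    n_steps = int(np.floor(steps_per_octave * np.log2(f_high / f_low)))
    k = np.arange(n_steps + 1)
    return f_low * 2.0 ** (k / steps_per_octave)

grid = f0_grid()
print(len(grid), grid[0], grid[-1])
```

Under these assumptions the grid has 112 points, and adjacent frequencies differ by a constant ratio of 2^(1/48).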

    An interference-free representation of group delay for periodic signals

    This article introduces a new group delay representation for periodic signals. The proposed method yields a group delay representation that is free from interference due to repetitive excitation. A power spectrum-weighted average of group delay, computed from shifted copies of the weighted group delay separated by half the fundamental frequency, is proven to have the desired property.

    Frequency domain variant of Velvet noise and its application to acoustic measurements

    We propose a new family of test signals for acoustic measurements such as impulse response, nonlinearity, and the effects of background noise. The proposed family addresses difficulties with existing families: the swept sine (SS) and pseudo-random noise such as the maximum length sequence (MLS). The proposed family uses the frequency domain variant of velvet noise (FVN) as its building block. An FVN is the impulse response of an all-pass filter and yields the unit impulse when convolved with the time-reversed version of itself. In this respect, FVN is a member of the time-stretched pulse (TSP) family in the broadest sense. The high degree of freedom in designing an FVN opens a vast range of applications in acoustic measurement. Among other possibilities, we introduce the following applications and their specific procedures. a) Spectrum shaping adaptive to background noise. b) Simultaneous measurement of impulse responses of multiple acoustic paths. c) Simultaneous measurement of linear and nonlinear components of an acoustic path. d) Automatic procedure for time axis alignment of the source and the receiver when they use independent clocks in acoustic impulse response measurement. We implemented a reference measurement tool equipped with all these procedures. The MATLAB source code and related materials are open-sourced and placed in a GitHub repository. Comment: 10 pages, 14 figures, APSIPA ASC 2019. arXiv admin note: text overlap with arXiv:1806.0681
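The all-pass property stated in this abstract (convolving the signal with its time-reversed copy yields the unit impulse) can be checked numerically. The sketch below is not the FVN design itself: it uses a generic all-pass response with uniformly random phase, and it works with circular convolution on a finite FFT grid. The function name `allpass_impulse_response` and all parameter values are assumptions for illustration.

```python
import numpy as np

def allpass_impulse_response(n_fft=1024, seed=0):
    """Hypothetical helper: real impulse response with unit magnitude at
    every FFT bin. The phase is random, NOT the actual FVN phase design."""
    rng = np.random.default_rng(seed)
    half = n_fft // 2
    phase = np.zeros(half + 1)
    # Keep the DC and Nyquist bins at zero phase so the response is real.
    phase[1:half] = rng.uniform(-np.pi, np.pi, half - 1)
    spectrum = np.exp(1j * phase)  # |H| = 1 at every bin
    return np.fft.irfft(spectrum, n=n_fft)

h = allpass_impulse_response()

# Circular convolution with the time-reversed copy: for a real signal,
# time reversal corresponds to complex conjugation in the DFT domain.
H = np.fft.fft(h)
recovered = np.fft.ifft(H * np.conj(H)).real

print(recovered[0], np.max(np.abs(recovered[1:])))
```

Because the magnitude is one at every bin, the product of the spectrum and its conjugate is flat, so `recovered` is one at index 0 and numerically zero elsewhere. An actual FVN shapes the phase so that the response is also localized in time, which this random-phase sketch does not attempt.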

    Error Evaluation of an F0-Adaptive Spectral Envelope Estimator in Robustness against the Additive Noise and F0 Error


    D4C, a band-aperiodicity estimator for high-quality speech synthesis

    An algorithm is proposed for estimating the band aperiodicity of speech signals, where “aperiodicity” is defined as the power ratio between the speech signal and the aperiodic component of the signal. Since this power ratio depends on the frequency band, the aperiodicity should be given for several frequency bands. The proposed D4C (Definitive Decomposition Derived Dirt-Cheap) estimator is based on an extension of a temporally static group delay representation of periodic signals. In this paper, the principle and algorithm of D4C are explained, and its effectiveness is discussed with reference to objective and subjective evaluations. Evaluation results indicate that a speech synthesis system using D4C can synthesize more natural speech than systems using other algorithms.
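The band-wise power ratio used as the definition of aperiodicity here can be illustrated on a synthetic signal whose aperiodic component is known in advance. This is only a sketch of the definition, not the D4C algorithm, which estimates the ratio without access to the noise. The function name `band_power_ratio_db`, the band edges, the signal parameters, and the choice to report aperiodic power over total power in dB are all assumptions.

```python
import numpy as np

def band_power_ratio_db(signal, aperiodic, fs, band_edges_hz):
    """Hypothetical helper: per-band ratio (in dB) of the power of the known
    aperiodic component to the power of the whole signal."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    p_sig = np.abs(np.fft.rfft(signal)) ** 2
    p_ap = np.abs(np.fft.rfft(aperiodic)) ** 2
    ratios = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        ratios.append(10.0 * np.log10(p_ap[mask].sum() / p_sig[mask].sum()))
    return np.array(ratios)

fs = 16000
t = np.arange(fs) / fs  # one second, so each harmonic falls on an FFT bin
rng = np.random.default_rng(0)

# Periodic part: harmonics of 100 Hz up to 3 kHz. Aperiodic part: white noise.
periodic = sum(np.sin(2 * np.pi * f * t) for f in range(100, 3001, 100))
noise = 0.01 * rng.standard_normal(fs)

ratios = band_power_ratio_db(periodic + noise, noise, fs, [0, 2000, 4000, 8000])
print(ratios)
```

In this toy signal the lowest band is dominated by harmonics, so its ratio is strongly negative, while the 4 kHz to 8 kHz band contains only noise and its ratio is close to 0 dB. This is the band dependence that the abstract gives as the reason for reporting aperiodicity per frequency band.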

    Modification of Velvet Noise for Speech Waveform Generation by Using Vocoder-Based Speech Synthesizer


    Development of exploratory research tools based on TANDEM-STRAIGHT

    This article introduces a new set of tools based on TANDEM-STRAIGHT, a fundamental reformulation of STRAIGHT, a speech analysis, modification, and resynthesis system introduced in 1997. STRAIGHT has been used in a wide range of speech-related research as a flexible tool for implementing experiments and applications, though its scientific foundation was not well established. TANDEM-STRAIGHT introduced a solid basis to resolve this difficulty while preserving the underlying concept of STRAIGHT. TANDEM is a procedure for estimating a temporally static power spectral representation of periodic signals. Together with this representation, the consistent sampling theory enabled a complete reformulation of all the algorithms of STRAIGHT and led to a new implementation. This new implementation was applied to develop a set of new tools: temporally variable multi-aspect speech morphing tools, a graphical user interface (GUI) for manipulating STRAIGHT and morphing parameters, and procedures for preparing stimuli for perceptual experiments. APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Infrastructure Software for Speech Processing (5 October 2009)